Pseudo-morpheme and Confusion Network Based Korean-English Statistical Spoken Language Translation System

Authors

  • Donghyeon Lee
  • Jonghoon Lee
  • Gary Geunbae Lee
Abstract

In this demonstration, we present POSSLT (POSTECH Spoken Language Translation), a Korean-English statistical spoken language translation (SLT) system based on pseudo-morphemes and confusion networks (CNs). As in most other SLT systems, automatic speech recognition (ASR) and machine translation (MT) are coupled in a cascade. We use a confusion network based approach to couple ASR and MT, which yields better translation quality and faster decoding than the N-best approach. In ASR and statistical MT (SMT) for Korean, the choice of processing unit affects performance, and the pseudo-morpheme unit is the best choice for Korean-English SLT. The models used in the SLT system are trained on a travel-domain conversational corpus.
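To make the contrast between a confusion network and an N-best list concrete, here is a minimal sketch in Python. It is not the POSSLT implementation: the data layout, the function names (`best_path`, `num_paths`, `n_best`), and the English toy tokens and probabilities are all assumptions made for illustration. A confusion network is modeled as a list of slots, each mapping candidate tokens to posterior probabilities, with the empty string standing in for a skip (epsilon) arc.

```python
from itertools import product
from math import prod

# Hypothetical toy confusion network (made-up tokens and posteriors).
# Each slot holds competing tokens for one position; "" is an epsilon arc.
toy_cn = [
    {"i": 0.9, "a": 0.1},
    {"want": 0.6, "won't": 0.4},
    {"to": 0.7, "": 0.3},
    {"go": 0.8, "know": 0.2},
]

def best_path(cn):
    """Take the highest-posterior token in every slot, dropping epsilons."""
    picks = (max(slot, key=slot.get) for slot in cn)
    return [tok for tok in picks if tok]

def num_paths(cn):
    """Number of distinct paths the network encodes (product of slot sizes)."""
    return prod(len(slot) for slot in cn)

def n_best(cn, n):
    """Expand the network into its n most probable paths.

    Enumerates every path, so the cost grows exponentially with the
    number of slots; this is for illustration only.
    """
    scored = []
    for path in product(*(slot.items() for slot in cn)):
        tokens = [tok for tok, _ in path if tok]
        score = prod(p for _, p in path)
        scored.append((score, tokens))
    scored.sort(key=lambda item: -item[0])
    return scored[:n]

if __name__ == "__main__":
    print(best_path(toy_cn))
    print(num_paths(toy_cn))
    for score, tokens in n_best(toy_cn, 2):
        print(round(score, 4), " ".join(tokens))
```

In this toy example four slots with two alternatives each encode 16 hypotheses in a compact structure, whereas an N-best list would have to spell out each hypothesis separately. That compactness is one plausible reading of why a decoder that consumes the network directly can be faster than translating every N-best entry in turn.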

Similar resources

Applying Statistical Post-Editing to English-to-Korean Rule-based Machine Translation System

Conventional rule-based machine translation systems suffer from weak fluency in target language generation. In particular, when translating spoken English into Korean, the fluency of the translation result is as important as its adequacy for readability and understanding. This problem is more severe in language pairs such as English-Korean. It’s because Englis...

The IRST English-Spanish translation system for European Parliament speeches

This paper presents the spoken language translation system developed at FBK-irst during the TC-STAR project. The system integrates automatic speech recognition with machine translation through the use of confusion networks, which make it possible to represent a huge number of transcription hypotheses generated by the speech recognizer. Confusion networks are efficiently decoded by a statistical machine t...

Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing

This paper presents a multi-strategic, hybrid approach to large-scale integrated speech and natural language processing, employing connectionist, statistical and symbolic techniques. The developed spoken Korean processing engine (SKOPE) integrates a connectionist TDNN-based phoneme recognition technique with statistical Viterbi-based lexical decoding and symbolic morphological/phonological an...

A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level ...

Sign Language Translation System (TeST) - Morpheme Converting Rules for Korean Predicates

There are two kinds of sign language used in Korea. One is Korean Sign Language, which follows the same rules as Korean grammar; the other is Korean Natural Sign Language (KnSL), which is the one actually used by deaf people. KnSL differs from the Korean language in its morpheme characteristics. Therefore, morpheme translation rules must be defined first for translation between the two. So, in this p...

Publication date: 2007